Conditions on Consistency of Probabilistic Tree Adjoining Grammars

Author

  • Anoop Sarkar
Abstract

Much of the power of probabilistic methods in modelling language comes from their ability to compare several derivations for the same string in the language. An important starting point for the study of such cross-derivational properties is the notion of consistency. The probability model defined by a probabilistic grammar is said to be consistent if the probabilities assigned to all the strings in the language sum to one. From the literature on probabilistic context-free grammars (CFGs), we know precisely the conditions which ensure that consistency holds for a given CFG. This paper derives the conditions under which a given probabilistic Tree Adjoining Grammar (TAG) can be shown to be consistent. It gives a simple algorithm for checking consistency and gives the formal justification for its correctness. The conditions derived here can be used to ensure that probability models that use TAGs can be checked for deficiency (i.e. whether any probability mass is assigned to strings that cannot be generated).
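
To make the notion of consistency concrete, the following is a minimal sketch for the simpler context-free case only; it is not the TAG algorithm of the paper, which is not reproduced on this page. It assumes a toy grammar (S -> S S with probability p, S -> a with probability 1 - p) and a hypothetical helper, `termination_mass`, that approximates the total probability of all finite derivations by fixed-point iteration. A value of one indicates a consistent model; a value below one indicates a deficient one.

```python
from math import prod


def termination_mass(rules, start, iterations=2000):
    """Approximate the total probability of finite derivations from `start`.

    `rules` maps a nonterminal to a list of (probability, right_hand_side)
    pairs. Any right-hand-side symbol that is not a key of `rules` is
    treated as a terminal. This helper is illustrative, not from the paper.
    """
    # Start from zero so the iteration converges to the least fixed point.
    q = {nt: 0.0 for nt in rules}
    for _ in range(iterations):
        q = {
            nt: sum(
                p * prod(q.get(sym, 1.0) for sym in rhs)  # terminals count as 1
                for p, rhs in expansions
            )
            for nt, expansions in rules.items()
        }
    return q[start]


def toy_grammar(p):
    # Assumed example: S -> S S with probability p, S -> a with probability 1 - p.
    return {"S": [(p, ("S", "S")), (1.0 - p, ("a",))]}


consistent_mass = termination_mass(toy_grammar(0.4), "S")
deficient_mass = termination_mass(toy_grammar(0.6), "S")
print(consistent_mass, deficient_mass)
```

With p = 0.4 the mass converges to one, so the strings of the language receive all of the probability. With p = 0.6 it converges to 2/3, which means a third of the probability mass is lost to derivations that never terminate.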

Similar resources

State-Split for Hypergraphs with an Application to Tree Adjoining Grammars

In this work, we present a generalization of the state-split method to probabilistic hypergraphs. We show how to represent the derivational structure of probabilistic tree-adjoining grammars by hypergraphs and detail how the generalized state-split procedure can be applied to such representations, yielding a state-split procedure for tree-adjoining grammars.

PreRkTAG: Prediction of RNA Knotted Structures Using Tree Adjoining Grammars

Background: RNA molecules play many important regulatory, catalytic and structural...

Developing a TT-MCTAG for German with an RCG-based Parser

Developing linguistic resources, in particular grammars, is known to be a complex task in itself, because of (amongst others) redundancy and consistency issues. Furthermore, some languages can prove hard to describe because of specific characteristics, e.g. the free word order in German. In this context, we present (i) a framework for describing tree-based grammars, and (ii) an...

Nonparametric Bayesian Inference and Efficient Parsing for Tree-adjoining Grammars

In the line of research extending statistical parsing to more expressive grammar formalisms, we demonstrate for the first time the use of tree-adjoining grammars (TAG). We present a Bayesian nonparametric model for estimating a probabilistic TAG from a parsed corpus, along with novel block sampling methods and approximation transformations for TAG that allow efficient parsing. Our work shows pe...

Multiple Context-Free Tree Grammars and Multi-component Tree Adjoining Grammars

Strong lexicalization is the process of turning a grammar generating trees into an equivalent one, in which all rules contain a terminal leaf. It is known that tree adjoining grammars cannot be strongly lexicalized, whereas the more powerful simple context-free tree grammars can. It is demonstrated that multiple simple context-free tree grammars are as expressive as multi-component tree adjoini...

Publication date: 1998